Abstract: The capacity region of the multiple access channel
with arbitrarily correlated sources remains an open problem. 
Cover, El Gamal and Salehi gave an achievable region in the 
form of single-letter entropy and mutual information expressions, 
without a single-letter converse. Cover, El Gamal and Salehi also 
gave a converse in terms of some n-letter mutual informations, 
which are incomputable. In this paper, we derive an upper bound 
for the sum rate of this channel in a single-letter expression 
by using spectrum analysis. The incomputability of the sum 
rate of the Cover, El Gamal and Salehi scheme comes from the
difficulty of characterizing the possible joint distributions for the 
n-letter channel inputs. Here we introduce a new data processing 
inequality, which leads to a single-letter necessary condition for 
these possible joint distributions. We develop a single-letter upper 
bound for the sum rate by using this single-letter necessary 
condition on the possible joint distributions. 

I. Introduction 

The problem of determining the capacity region of the
multiple access channel with correlated sources can be formulated
as follows. Given a pair of correlated sources $(U, V)$
described by the joint probability distribution $p(u,v)$, and a
discrete, memoryless, multiple access channel characterized by
the transition probability $p(y|x_1,x_2)$, what are the necessary
and sufficient conditions for the reliable transmission of $n$
independent identically distributed (i.i.d.) samples of the sources
through the channel, in $n$ channel uses, as $n \to \infty$?

This problem was studied by Cover, El Gamal and Salehi 
in [1], where an achievable region expressed by single-letter 
entropies and mutual informations was given. This region 
was shown to be suboptimal by Dueck [2]. Cover, El Gamal 
and Salehi [1] also provided a capacity result with both 
achievability and converse in incomputable expressions in the 
form of some n-letter mutual informations. In this paper, we 
derive an upper bound for the sum rate of this channel in a 
single-letter expression. 

The incomputability of the sum rate of the Cover, El Gamal
and Salehi scheme is due to the difficulty of characterizing
the possible joint distributions for the $n$-letter channel inputs.
The Cover, El Gamal, Salehi converse is

$$ H(U,V) \le \frac{1}{n} I(X_1^n, X_2^n; Y^n) \tag{1} $$

where the random variables involved have a joint distribution
expressed in the form

$$ \prod_{i=1}^{n} p(u_i,v_i)\, p(x_1^n|u^n)\, p(x_2^n|v^n) \prod_{i=1}^{n} p(y_i|x_{1i},x_{2i}) \tag{2} $$

i.e., the sources and the channel inputs satisfy the Markov
chain relation $X_1^n \to U^n \to V^n \to X_2^n$. It is difficult to
evaluate the mutual information on the right hand side of (1)
when the joint probability distribution of the random variables
involved is subject to (2).

This work was supported by NSF Grants CCR 03-11311, CCF 04-47613
and CCF 05-14846; and ARL/CTA Grant DAAD 19-01-2-0011.

A usual way to upper bound the mutual information in (1) is

$$ \frac{1}{n} I(X_1^n, X_2^n; Y^n) \le \frac{1}{n} \sum_{i=1}^{n} I(X_{1i}, X_{2i}; Y_i) \le \max I(X_1, X_2; Y) \tag{3} $$

where the maximization in (3) is over all possible $X_1$ and
$X_2$ such that $X_1 \to U^n \to V^n \to X_2$. Therefore, combining
(1) and (3), a single-letter upper bound for the sum rate is
obtained as

$$ H(U,V) \le \max I(X_1, X_2; Y) \tag{4} $$

where the maximization is over all $X_1$, $X_2$ such that
$X_1 \to U^n \to V^n \to X_2$. However, a closed form expression for
$p(x_1,x_2)$ satisfying this Markov chain, for all $U$, $V$ and $n$,
seems intractable to obtain.

The data processing inequality [3, p. 32] is an intuitive way
to obtain a necessary condition on $p(x_1,x_2)$ for the above
Markov chain constraint, i.e., we may try to solve the following
problem as an upper bound for (4)

$$ \begin{aligned} \max\ & I(X_1, X_2; Y) \\ \text{s.t.}\ & I(X_1; X_2) \le I(U^n; V^n) = n I(U, V) \end{aligned} \tag{5} $$

where the "s.t." line provides a constraint on the feasible set of
$p(x_1,x_2)$. However, when $n$ is large, this upper bound becomes
trivial, as $nI(U, V)$ quickly gets larger than $I(X_1;X_2)$
for any $p(x_1,x_2)$, even without the Markov chain constraint.
Although the data processing inequality in its usual form does
not prove useful in this problem, we will still use the basic
methodology of employing a data processing inequality to
represent the Markov chain constraint on the valid input
distributions. For this, we will introduce a new data processing
inequality.
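To make the triviality of the constraint in (5) concrete, the following minimal sketch computes $I(U;V)$ for one binary source and the smallest $n$ at which the constraint stops excluding any pair of binary inputs. The joint pmf and the helper name `mutual_information` are illustrative assumptions; the only fact used is that $I(X_1;X_2) \le 1$ bit when both inputs are binary.

```python
import math

def mutual_information(joint):
    """I(U;V) in bits for a joint pmf given as a list of rows."""
    pu = [sum(row) for row in joint]
    pv = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log2(p / (pu[i] * pv[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

# Hypothetical binary source, chosen only for illustration.
p_uv = [[1 / 3, 1 / 6], [1 / 6, 1 / 3]]
i_uv = mutual_information(p_uv)

# Binary inputs satisfy I(X1;X2) <= 1 bit, so the constraint
# I(X1;X2) <= n * I(U;V) is vacuous as soon as n * I(U;V) >= 1.
n_vacuous = math.ceil(1.0 / i_uv)
print(f"I(U;V) = {i_uv:.4f} bits, constraint vacuous for n >= {n_vacuous}")
```

For this source the constraint carries no information after roughly a dozen channel uses, which is why a bound that does not grow with $n$ is needed.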

Spectrum analysis has been instrumental in the study of 
some properties of pairs of correlated random variables, es- 
pecially, those of the i.i.d. sequences of pairs of correlated 
random variables, e.g., common information in [4] and iso- 
morphism in [5]. In this paper, we use spectrum analysis 
to introduce a new data processing inequality. Our new data 
processing inequality provides a single-letter necessary con- 
dition for the joint distributions satisfying the Markov chain 
condition, and leads to a non-trivial single-letter upper bound 
for the sum rate of the multiple access channel with correlated 
sources. 

II. Some Preliminaries 

In this section, we provide some basic results that will be
used in our later development. The concepts used here are 
originally introduced by Witsenhausen in [4] in the context of 
operator theory. Here, we limit ourselves to the finite alphabet 
case, and derive our results by means of matrix theory. 

We first introduce our matrix notation for probability distributions.
For a pair of discrete random variables $X$ and $Y$, which take
values in $\mathcal{X} = \{x_1, x_2, \dots, x_m\}$ and
$\mathcal{Y} = \{y_1, y_2, \dots, y_n\}$, respectively, the joint
distribution matrix $P_{XY}$ is defined as
$P_{XY}(i,j) = \Pr(X = x_i, Y = y_j)$, where $P_{XY}(i,j)$ denotes
the $(i,j)$-th element of the matrix $P_{XY}$. From this definition,
we have $P_{XY}^T = P_{YX}$. The marginal distribution of a random
variable $X$ is defined as a diagonal matrix $P_X$ with
$P_X(i,i) = \Pr(X = x_i)$. The vector-form marginal distribution is
defined as $p_X(i) = \Pr(X = x_i)$, i.e., $p_X = P_X e$, where $e$
is a vector of all ones. Similarly, we define
$p_X^{1/2} = P_X^{1/2} e$ and $p_X^{-1/2} = P_X^{-1/2} e$. The
conditional distribution of $X$ given $Y$ is defined in the matrix
form as $P_{X|Y}(i,j) = \Pr(X = x_i | Y = y_j)$, and
$P_{X|Y} = P_{XY} P_Y^{-1}$.

We define a new quantity, $\tilde{P}_{XY}$, which will play an
important role in the rest of the paper, as

$$ \tilde{P}_{XY} = P_X^{-1/2} P_{XY} P_Y^{-1/2} \tag{6} $$

Our main theorem in this section identifies the spectral
properties of $\tilde{P}_{XY}$. Before stating our theorem, we provide
the following lemma, which will be used in its proof.

Lemma 1 [6, p. 49] The spectral radius of a stochastic matrix 
is 1. A non-negative matrix T is stochastic if and only if e is 
an eigenvector of T corresponding to the eigenvalue 1. 

Theorem 1 An $m \times n$ non-negative matrix $P$ is a joint
distribution matrix with marginal distributions $P_X$ and $P_Y$,
i.e., $Pe = p_X = P_X e$ and $P^T e = p_Y = P_Y e$, if and only if
the singular value decomposition (SVD) of
$\tilde{P} = P_X^{-1/2} P P_Y^{-1/2}$ satisfies

$$ \tilde{P} = U \Lambda V^T = p_X^{1/2} (p_Y^{1/2})^T + \sum_{i=2}^{l} \lambda_i u_i v_i^T \tag{7} $$

where $U = [u_1, \dots, u_l]$ and $V = [v_1, \dots, v_l]$ are two
unitary matrices, $\Lambda = \mathrm{diag}[\lambda_1, \dots, \lambda_l]$
and $l = \min(m, n)$; $u_1 = p_X^{1/2}$, $v_1 = p_Y^{1/2}$, and
$\lambda_1 = 1 \ge \lambda_2 \ge \dots \ge \lambda_l \ge 0$. That is,
all of the singular values of $\tilde{P}$ are between 0 and 1, the
largest singular value of $\tilde{P}$ is 1, and the corresponding
left and right singular vectors are $p_X^{1/2}$ and $p_Y^{1/2}$.
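Theorem 1 is easy to check numerically. The sketch below assumes NumPy and an arbitrary 2 x 3 joint pmf chosen for illustration (not taken from the paper); the helper name `normalized_joint` is hypothetical. It forms $P_X^{-1/2} P P_Y^{-1/2}$ and inspects its SVD.

```python
import numpy as np

def normalized_joint(p_xy):
    """Return P_X^{-1/2} P_XY P_Y^{-1/2} for a joint pmf with positive marginals."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    return p_xy / np.sqrt(np.outer(p_x, p_y))

# Illustrative joint pmf (an assumption made for this example).
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.05, 0.30]])
p_tilde = normalized_joint(p_xy)
u, s, vt = np.linalg.svd(p_tilde)

print("singular values:", s)
# Singular vectors are only determined up to sign, so compare absolute values.
left_ok = np.allclose(np.abs(u[:, 0]), np.sqrt(p_xy.sum(axis=1)))
right_ok = np.allclose(np.abs(vt[0]), np.sqrt(p_xy.sum(axis=0)))
print("u_1 = sqrt(p_X):", left_ok, " v_1 = sqrt(p_Y):", right_ok)
```

The largest singular value comes out as 1 and the remaining one lies strictly below 1 here, because every entry of this pmf is positive.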

Proof: Let $\tilde{P}$ satisfy (7), then

$$ \begin{aligned} P_X^{1/2} \tilde{P} P_Y^{1/2} e &= P_X^{1/2} \Big( p_X^{1/2} (p_Y^{1/2})^T + \sum_{i=2}^{l} \lambda_i u_i v_i^T \Big) p_Y^{1/2} \\ &= P_X^{1/2} p_X^{1/2} (p_Y^{1/2})^T p_Y^{1/2} + P_X^{1/2} \sum_{i=2}^{l} \lambda_i u_i v_i^T v_1 \\ &= p_X \end{aligned} \tag{8} $$

Similarly, $e^T P_X^{1/2} \tilde{P} P_Y^{1/2} = p_Y^T$. Thus, the
non-negative matrix $P_X^{1/2} \tilde{P} P_Y^{1/2}$ is a joint
distribution matrix with marginal distributions $p_X$ and $p_Y$.

Conversely, we consider a joint distribution $P$ with marginal
distributions $p_X$ and $p_Y$. We need to show that the singular
values of $\tilde{P}$ lie in $[0, 1]$, the largest singular value is
equal to 1, and $p_X^{1/2}$ and $p_Y^{1/2}$, respectively, are the
left and right singular vectors corresponding to the singular value 1.

To this end, we first construct a Markov chain $X \to Y \to Z$
with $P_{XY} = P_{ZY} = P$. Note that this also implies $P_X = P_Z$,
$\tilde{P}_{XY} = \tilde{P}_{ZY} = \tilde{P}$, and
$P_{X|Y} = P_{Z|Y}$. The special structure of the constructed Markov
chain provides the following:

$$ \begin{aligned} P_{X|Z} &= P_{X|Y} P_{Y|Z} = P_{X|Y} P_{Y|X} = P P_Y^{-1} P^T P_X^{-1} \\ &= P_X^{1/2} \big( P_X^{-1/2} P P_Y^{-1/2} \big) \big( P_Y^{-1/2} P^T P_X^{-1/2} \big) P_X^{-1/2} \\ &= P_X^{1/2} \tilde{P} \tilde{P}^T P_X^{-1/2} \end{aligned} \tag{9} $$

We note that the matrix $P_{X|Z}$ is similar to the matrix
$\tilde{P}\tilde{P}^T$ [7, p. 44]. Therefore, all eigenvalues of
$P_{X|Z}$ are the eigenvalues of $\tilde{P}\tilde{P}^T$ as well, and
if $v$ is a left eigenvector of $P_{X|Z}$ corresponding to an
eigenvalue $\mu$, then $P_X^{1/2} v$ is a left eigenvector of
$\tilde{P}\tilde{P}^T$ corresponding to the same eigenvalue.

We note that $P_{X|Z}$ is a stochastic matrix, therefore, from
Lemma 1, $e$ is a left eigenvector of $P_{X|Z}$ corresponding to the
eigenvalue 1, which is also equal to the spectral radius of
$P_{X|Z}$. Since $P_{X|Z}$ is similar to $\tilde{P}\tilde{P}^T$, we
have that $p_X^{1/2}$ is a left eigenvector of $\tilde{P}\tilde{P}^T$
with eigenvalue 1, and the rest of the eigenvalues of
$\tilde{P}\tilde{P}^T$ lie in $[-1,1]$. In addition,
$\tilde{P}\tilde{P}^T$ is a symmetric positive semi-definite matrix,
which implies that the eigenvalues of $\tilde{P}\tilde{P}^T$ are real
and non-negative. Since the eigenvalues of $\tilde{P}\tilde{P}^T$ are
non-negative, and the largest eigenvalue is equal to 1, we conclude
that all of the eigenvalues of $\tilde{P}\tilde{P}^T$ lie in the
interval $[0,1]$.

The singular values of $\tilde{P}$ are the square roots of the
eigenvalues of $\tilde{P}\tilde{P}^T$, and the left singular vectors
of $\tilde{P}$ are the eigenvectors of $\tilde{P}\tilde{P}^T$. Thus,
the singular values of $\tilde{P}$ lie in $[0, 1]$, the largest
singular value is equal to 1, and $p_X^{1/2}$ is a left singular
vector corresponding to the singular value 1. The corresponding right
singular vector is

$$ v_1^T = u_1^T \tilde{P} = (p_X^{1/2})^T P_X^{-1/2} P P_Y^{-1/2} = p_Y^T P_Y^{-1/2} = (p_Y^{1/2})^T \tag{10} $$

which concludes the proof. ■
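The similarity argument around (9) can also be checked numerically. This is a minimal sketch, assuming NumPy and the same kind of illustrative joint pmf as before: it builds $P_{X|Z} = P P_Y^{-1} P^T P_X^{-1}$ for the symmetric chain used in the proof and compares its eigenvalues with the squared singular values of the normalized matrix.

```python
import numpy as np

# Illustrative joint pmf P = P_XY = P_ZY (an assumption made for this example).
p = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.05, 0.30]])
p_x, p_y = p.sum(axis=1), p.sum(axis=0)

# P_{X|Z} for the chain X -> Y -> Z with P_XY = P_ZY = P.
p_x_given_z = p @ np.diag(1.0 / p_y) @ p.T @ np.diag(1.0 / p_x)

p_tilde = p / np.sqrt(np.outer(p_x, p_y))
sing = np.linalg.svd(p_tilde, compute_uv=False)
eig = np.sort(np.linalg.eigvals(p_x_given_z).real)[::-1]

column_sums = p_x_given_z.sum(axis=0)   # each column is a conditional pmf
print("column sums:", column_sums)
print("eigenvalues of P_X|Z:", eig)
print("squared singular values:", sing ** 2)
```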
III. A New Data Processing Inequality 

In this section, we introduce a new data processing inequal- 
ity in the following theorem. We first provide a lemma that 
will be used in its proof. 

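Lemma 2 [8, p. 178] For matrices $A$ and $B$

$$ \lambda_i(AB) \le \lambda_i(A)\,\lambda_1(B) \tag{11} $$

where $\lambda_i(\cdot)$ denotes the $i$-th largest singular value of a matrix.

Theorem 2 If $X \to Y \to Z$, then

$$ \lambda_i(\tilde{P}_{XZ}) \le \lambda_i(\tilde{P}_{XY})\,\lambda_2(\tilde{P}_{YZ}) \le \lambda_i(\tilde{P}_{XY}) \tag{12} $$

where $i = 2, \dots, \mathrm{rank}(\tilde{P}_{XZ})$.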

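Proof: From the structure of the Markov chain, and from
the definition of $\tilde{P}_{XY}$ in (6), we have

$$ \tilde{P}_{XZ} = P_X^{-1/2} P_{XZ} P_Z^{-1/2} = \tilde{P}_{XY} \tilde{P}_{YZ} \tag{13} $$

Using (7) for $\tilde{P}_{XZ}$, we obtain

$$ \tilde{P}_{XZ} = p_X^{1/2} (p_Z^{1/2})^T + \sum_{i \ge 2} \lambda_i(\tilde{P}_{XZ})\, u_i(\tilde{P}_{XZ})\, v_i(\tilde{P}_{XZ})^T \tag{14} $$

and using (7) for $\tilde{P}_{XY}$ and $\tilde{P}_{YZ}$ yields

$$ \begin{aligned} \tilde{P}_{XY} \tilde{P}_{YZ} &= \Big( p_X^{1/2} (p_Y^{1/2})^T + \sum_{i \ge 2} \lambda_i(\tilde{P}_{XY})\, u_i(\tilde{P}_{XY})\, v_i(\tilde{P}_{XY})^T \Big) \\ &\quad \cdot \Big( p_Y^{1/2} (p_Z^{1/2})^T + \sum_{i \ge 2} \lambda_i(\tilde{P}_{YZ})\, u_i(\tilde{P}_{YZ})\, v_i(\tilde{P}_{YZ})^T \Big) \\ &= p_X^{1/2} (p_Z^{1/2})^T + \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{XY})\, u_i(\tilde{P}_{XY})\, v_i(\tilde{P}_{XY})^T \Big) \\ &\quad \times \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{YZ})\, u_i(\tilde{P}_{YZ})\, v_i(\tilde{P}_{YZ})^T \Big) \end{aligned} \tag{15} $$

where the two cross-terms vanish since $p_Y^{1/2}$ is both
$v_1(\tilde{P}_{XY})$ and $u_1(\tilde{P}_{YZ})$, and therefore,
$p_Y^{1/2}$ is orthogonal to both $v_i(\tilde{P}_{XY})$ and
$u_j(\tilde{P}_{YZ})$, for all $i, j \ne 1$. Using (13) and
equating (14) and (15), we obtain

$$ \begin{aligned} \sum_{i \ge 2} \lambda_i(\tilde{P}_{XZ})\, u_i(\tilde{P}_{XZ})\, v_i(\tilde{P}_{XZ})^T &= \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{XY})\, u_i(\tilde{P}_{XY})\, v_i(\tilde{P}_{XY})^T \Big) \\ &\quad \times \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{YZ})\, u_i(\tilde{P}_{YZ})\, v_i(\tilde{P}_{YZ})^T \Big) \end{aligned} \tag{16} $$

The proof is completed by applying Lemma 2 to (16). ■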
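As a sanity check of Theorem 2, the following sketch (assuming NumPy; the random Markov chain, the seed, and the helper name `singular_values` are illustrative assumptions) draws a chain $X \to Y \to Z$ and compares the singular values on both sides of (12).

```python
import numpy as np

def singular_values(p_joint):
    """Singular values of P_A^{-1/2} P_AB P_B^{-1/2}, largest first."""
    p_a, p_b = p_joint.sum(axis=1), p_joint.sum(axis=0)
    return np.linalg.svd(p_joint / np.sqrt(np.outer(p_a, p_b)), compute_uv=False)

rng = np.random.default_rng(0)                 # fixed seed, illustrative only
p_xy = rng.random((3, 4))
p_xy /= p_xy.sum()                             # random joint pmf of (X, Y)
p_z_given_y = rng.random((4, 3))
p_z_given_y /= p_z_given_y.sum(axis=1, keepdims=True)   # row y holds Pr(Z | Y = y)

p_y = p_xy.sum(axis=0)
p_yz = p_y[:, None] * p_z_given_y              # joint pmf of (Y, Z)
p_xz = p_xy @ p_z_given_y                      # X -> Y -> Z: sum_y p(x, y) p(z | y)

lam_xy, lam_yz, lam_xz = map(singular_values, (p_xy, p_yz, p_xz))
for i in range(1, len(lam_xz)):
    bound = lam_xy[i] * lam_yz[1]
    print(f"lambda_{i + 1}(XZ) = {lam_xz[i]:.4f} <= {bound:.4f}")
```

A single random draw is of course not a proof; it only illustrates how the inequality behaves on one instance.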



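IV. On i.i.d. Sequences

Let $(X^n, Y^n)$ be a pair of i.i.d. sequences, where each pair
of letters of these sequences satisfies a joint distribution $P_{XY}$.
Thus, the joint distribution of the sequences is
$P_{X^nY^n} = P_{XY}^{\otimes n}$, where $A^{\otimes 1} = A$,
$A^{\otimes k} = A \otimes A^{\otimes (k-1)}$, and $\otimes$ represents
the Kronecker product of matrices [7].

From (6),

$$ P_{XY} = P_X^{1/2} \tilde{P}_{XY} P_Y^{1/2} \tag{17} $$

Then,

$$ P_{X^nY^n} = P_{XY}^{\otimes n} = \big( P_X^{1/2} \tilde{P}_{XY} P_Y^{1/2} \big)^{\otimes n} = (P_X^{1/2})^{\otimes n}\, \tilde{P}_{XY}^{\otimes n}\, (P_Y^{1/2})^{\otimes n} \tag{18} $$

We also have $P_{X^n} = (P_X)^{\otimes n}$ and $P_{Y^n} = (P_Y)^{\otimes n}$. Thus,

$$ \begin{aligned} \tilde{P}_{X^nY^n} &= P_{X^n}^{-1/2} P_{X^nY^n} P_{Y^n}^{-1/2} \\ &= (P_X^{-1/2})^{\otimes n} (P_X^{1/2})^{\otimes n}\, \tilde{P}_{XY}^{\otimes n}\, (P_Y^{1/2})^{\otimes n} (P_Y^{-1/2})^{\otimes n} \\ &= \tilde{P}_{XY}^{\otimes n} \end{aligned} \tag{19} $$

Applying SVD to $\tilde{P}_{X^nY^n}$, we have

$$ \tilde{P}_{X^nY^n} = U_n \Lambda_n V_n^T = \tilde{P}_{XY}^{\otimes n} = U^{\otimes n} \Lambda^{\otimes n} (V^{\otimes n})^T \tag{20} $$

From the uniqueness of the SVD, we know that $U_n = U^{\otimes n}$,
$\Lambda_n = \Lambda^{\otimes n}$ and $V_n = V^{\otimes n}$. Then, the
ordered singular values of $\tilde{P}_{X^nY^n}$ are

$$ \{1, \lambda_2(\tilde{P}_{XY}), \dots, \lambda_2(\tilde{P}_{XY}), \dots\} $$

where the second through the $(n+1)$-st singular values are all
equal to $\lambda_2(\tilde{P}_{XY})$.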
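The Kronecker-product structure can be observed directly. The sketch below assumes NumPy and an illustrative 2 x 3 joint pmf; `normalized_singular_values` is a hypothetical helper. It builds the joint pmf of $(X^n, Y^n)$ for $n = 3$ and lists the leading singular values of the normalized matrix.

```python
from functools import reduce

import numpy as np

def normalized_singular_values(p_joint):
    """Singular values of P_A^{-1/2} P_AB P_B^{-1/2}, largest first."""
    p_a, p_b = p_joint.sum(axis=1), p_joint.sum(axis=0)
    return np.linalg.svd(p_joint / np.sqrt(np.outer(p_a, p_b)), compute_uv=False)

# Illustrative single-letter joint pmf (an assumption made for this example).
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.05, 0.30]])
n = 3
p_n = reduce(np.kron, [p_xy] * n)          # joint pmf of (X^n, Y^n), shape 8 x 27

lam = normalized_singular_values(p_xy)
lam_n = normalized_singular_values(p_n)
print("single letter:", lam)
print("n-letter, first n + 2 values:", lam_n[: n + 2])
```

The value $\lambda_2$ of the single-letter matrix appears $n$ times right after the leading 1, and the next value is $\lambda_2^2$, so the second singular value does not change with $n$.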

V. A Necessary Condition 
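As stated in Section I, the sum rate can be upper bounded as

$$ H(U,V) \le \max I(X_1, X_2; Y) \tag{21} $$

where the maximization is over all possible $X_1$ and $X_2$ that
satisfy the Markov chain $X_1 \to U^n \to V^n \to X_2$.

From Theorem 2 in Section III, we know that if
$X_1 \to U^n \to V^n \to X_2$, then, for
$i = 2, \dots, \mathrm{rank}(\tilde{P}_{X_1X_2})$,

$$ \lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{X_1U^n})\, \lambda_i(\tilde{P}_{U^nV^n})\, \lambda_2(\tilde{P}_{V^nX_2}) \tag{22} $$

We showed in Section IV that
$\lambda_i(\tilde{P}_{U^nV^n}) \le \lambda_2(\tilde{P}_{UV})$ for
$i \ge 2$, and $\lambda_i(\tilde{P}_{U^nV^n}) = \lambda_2(\tilde{P}_{UV})$
for $i = 2, \dots, n+1$. Therefore, for
$i = 2, \dots, \mathrm{rank}(\tilde{P}_{X_1X_2})$, we have

$$ \lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{X_1U^n})\, \lambda_2(\tilde{P}_{UV})\, \lambda_2(\tilde{P}_{V^nX_2}) \tag{23} $$

From Theorem 1, we know that $\lambda_2(\tilde{P}_{X_1U^n}) \le 1$ and
$\lambda_2(\tilde{P}_{V^nX_2}) \le 1$. Next, in Theorem 3, we determine
that the least upper bound for $\lambda_2(\tilde{P}_{X_1U^n})$ and
$\lambda_2(\tilde{P}_{V^nX_2})$ is also 1.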

Theorem 3 Let $F(n, P_{X_1})$ be the set of all joint distributions
for $X_1$ and $U^n$ with a given marginal distribution for $X_1$,
$P_{X_1}$. Then,

$$ \sup_{F(n, P_{X_1}),\ n = 1, 2, \dots} \lambda_2(\tilde{P}_{X_1U^n}) = 1 \tag{24} $$

The proof of Theorem 3 is given in the Appendix.

Combining (23) and Theorem 3, we obtain the main result
of our paper, which is stated in the following theorem.

Theorem 4 If a pair of i.i.d. sources $(U, V)$ with joint
distribution $P_{UV}$ can be transmitted reliably through a
discrete, memoryless, multiple access channel characterized by
$P_{Y|X_1X_2}$, then

$$ H(U,V) \le I(X_1, X_2; Y) \tag{25} $$

for some $(X_1, X_2)$ with

$$ \lambda_i(\tilde{P}_{X_1X_2}) \le \lambda_2(\tilde{P}_{UV}), \quad i = 2, \dots, \mathrm{rank}(\tilde{P}_{X_1X_2}). \tag{26} $$
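The spectral condition (26) is straightforward to test for a candidate input distribution. The following is a minimal sketch assuming NumPy and joint pmfs with strictly positive marginals; the function names and the example distributions are hypothetical, chosen only to show one input pair that passes and one that fails.

```python
import numpy as np

def normalized_singular_values(p_joint):
    """Singular values of P_A^{-1/2} P_AB P_B^{-1/2}, largest first."""
    p_joint = np.asarray(p_joint, dtype=float)
    p_a, p_b = p_joint.sum(axis=1), p_joint.sum(axis=0)
    return np.linalg.svd(p_joint / np.sqrt(np.outer(p_a, p_b)), compute_uv=False)

def satisfies_condition(p_x1x2, p_uv, tol=1e-9):
    """Check lambda_i(inputs) <= lambda_2(source) for i = 2, ..., rank."""
    lam_in = normalized_singular_values(p_x1x2)
    lam_src = normalized_singular_values(p_uv)
    rank = int(np.sum(lam_in > tol))
    return bool(np.all(lam_in[1:rank] <= lam_src[1] + tol))

p_uv = [[1 / 3, 1 / 6], [1 / 6, 1 / 3]]                       # illustrative source
print(satisfies_condition(p_uv, p_uv))                        # True: inputs copy the source
print(satisfies_condition([[0.5, 0.0], [0.0, 0.5]], p_uv))    # False: fully correlated inputs
```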

VI. Some Simple Examples 

We consider a multiple access channel where the alphabets
of $X_1$, $X_2$ and $Y$ are all binary, and the channel transition
probability matrix $p(y|x_1,x_2)$ is given as

    y \ x1x2     11     10     01     00
       1          1    1/2    1/2      0
       0          0    1/2    1/2      1

The following is a trivial upper bound, which we provide as
a benchmark,

$$ \max_{p(x_1,x_2)} I(X_1, X_2; Y) = 1 \tag{27} $$

where the maximization is over all binary bivariate distributions.
The maximum is achieved by
$P(X_1 = 1, X_2 = 1) = P(X_1 = 0, X_2 = 0) = 1/2$. We note that this
upper bound does not depend on the source distribution.
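The value in (27) can be reproduced with a few lines of plain Python. This sketch assumes the transition probabilities as read from the table above, with the zero entries restored ($Y = 1$ with certainty for inputs 11, $Y = 0$ with certainty for inputs 00, and a fair coin otherwise); the names `CHANNEL` and `input_output_information` are hypothetical.

```python
import math

# CHANNEL[(x1, x2)] = Pr(Y = 1 | x1, x2), as read from the channel table.
CHANNEL = {(1, 1): 1.0, (1, 0): 0.5, (0, 1): 0.5, (0, 0): 0.0}

def binary_entropy(q):
    """Entropy in bits of a Bernoulli(q) variable."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def input_output_information(p_inputs):
    """I(X1,X2;Y) in bits for p_inputs[(x1, x2)] = Pr(X1 = x1, X2 = x2)."""
    q = sum(p * CHANNEL[x] for x, p in p_inputs.items())            # Pr(Y = 1)
    h_y_given_x = sum(p * binary_entropy(CHANNEL[x]) for x, p in p_inputs.items())
    return binary_entropy(q) - h_y_given_x

best = {(1, 1): 0.5, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.5}
uniform = {x: 0.25 for x in CHANNEL}
print(input_output_information(best))      # 1.0
print(input_output_information(uniform))   # 0.5
```

Independent uniform inputs reach only half a bit, which shows how much the achievable value depends on the correlation allowed between the inputs.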

First, we consider a binary source $(U, V)$ with the following
joint distribution $p(u, v)$

    u \ v      1      0
      1       1/3    1/6
      0       1/6    1/3

In this case, $H(U,V) = 1.92$. We first note, using the trivial
upper bound in (27), that it is impossible to transmit this
source through the given channel reliably. The upper bound
we developed in this paper gives 2/3 for this source. We also
note that, for this case, our upper bound coincides with the
single-letter achievability expression given in [1], which is

$$ H(U,V) \le I(X_1, X_2; Y) \tag{28} $$

where $X_1, X_2$ are such that $X_1 \to U \to V \to X_2$ holds.
Therefore, for this case, our upper bound is the converse, as
it matches the achievability expression.
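The value 2/3 can be reproduced by a brute-force search over binary input distributions. The sketch below rests on several assumptions that should be read as such: the channel entries are taken from the table with zeros restored, the search runs over a coarse grid (denominator 12), and it uses the fact that for a 2 x 2 normalized matrix the second singular value equals the absolute determinant, since the first singular value is 1. All function names are hypothetical.

```python
import math

def h2(q):
    """Binary entropy in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def info(p11, p10, p01, p00):
    """I(X1,X2;Y) for the channel: Y = x when x1 = x2 = x, fair coin otherwise."""
    q = p11 + 0.5 * (p10 + p01)          # Pr(Y = 1)
    return h2(q) - (p10 + p01)           # H(Y) - H(Y | X1, X2)

def lambda2(p11, p10, p01, p00):
    """Second singular value of the normalized 2 x 2 input matrix."""
    a, b = p11 + p10, p11 + p01          # Pr(X1 = 1), Pr(X2 = 1)
    denom = a * (1 - a) * b * (1 - b)
    if denom <= 0.0:                     # a constant input: rank one, no constraint
        return 0.0
    return abs(p11 * p00 - p10 * p01) / math.sqrt(denom)

def sum_rate_bound(lam_source, steps=12):
    """Grid search for max I(X1,X2;Y) subject to lambda2(inputs) <= lam_source."""
    best = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            for k in range(steps + 1 - i - j):
                p = (i / steps, j / steps, k / steps, (steps - i - j - k) / steps)
                if lambda2(*p) <= lam_source + 1e-9:
                    best = max(best, info(*p))
    return best

# lambda_2 of the first source is 1/3 (its normalized matrix is [[2/3, 1/3], [1/3, 2/3]]).
print(sum_rate_bound(1 / 3))
```

On this grid the search returns 2/3 up to floating-point rounding, attained at the source distribution itself. For other sources the optimum generally falls between grid points, so a finer `steps` value would be needed and the result would only be approximate.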

Next, we consider a binary source $(U, V)$ with the following
joint distribution $p(u, v)$

    u \ v      1      0
      1        0     0.1
      0       0.1    0.8

In this case, $H(U, V) = 0.92$, the single-letter achievability
in (28) reaches 0.51 and our upper bound is 0.56. The gap
between the achievability and our upper bound is quite small.
We note that, in this case, the trivial upper bound in (27)
fails to test whether it is possible to have reliable transmission
or not, while our upper bound determines conclusively that
reliable transmission is not possible.

Finally, we consider a binary source $(U, V)$ with the following
joint distribution $p(u, v)$

    u \ v      1      0
      1        0     0.85
      0       0.1    0.05

In this case, $H(U, V) = 0.75$, the single-letter achievability
expression in (28) gives 0.57 and our upper bound is 0.9. We
note that the joint entropy of the sources falls into the gap
between the achievability expression and our upper bound,
which means that we cannot conclude whether it is possible
(or not) to transmit these sources through the channel reliably.
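The joint entropies quoted for the three sources, and the source-side quantity $\lambda_2$ that enters condition (26), can be recomputed as follows. The sketch assumes the three joint pmfs as read from the tables, with the missing zero entries restored so that each table sums to one; the names `SOURCES`, `joint_entropy` and `lambda2_binary` are hypothetical.

```python
import math

# Rows are u = 1, 0 and columns are v = 1, 0, as read from the tables.
SOURCES = {
    "first": [[1 / 3, 1 / 6], [1 / 6, 1 / 3]],
    "second": [[0.0, 0.1], [0.1, 0.8]],
    "third": [[0.0, 0.85], [0.1, 0.05]],
}

def joint_entropy(p):
    """H(U,V) in bits."""
    return -sum(x * math.log2(x) for row in p for x in row if x > 0)

def lambda2_binary(p):
    """Second singular value of the normalized 2 x 2 matrix (absolute determinant)."""
    (a, b), (c, d) = p
    marginals = (a + b) * (c + d) * (a + c) * (b + d)
    return abs(a * d - b * c) / math.sqrt(marginals)

for name, p in SOURCES.items():
    print(name, round(joint_entropy(p), 2), round(lambda2_binary(p), 3))
```

The entropies round to 1.92, 0.92 and 0.75, matching the values in the text. The weakly correlated second source has a small $\lambda_2$, which is consistent with its bound being far below the trivial value of 1.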

VII. Conclusion 

In this paper, we investigated the problem of transmitting 
correlated sources through a multiple access channel. We 
used spectrum analysis to develop a new data processing
inequality, which provided a single-letter necessary condi- 
tion for the joint distributions satisfying the Markov chain 
condition. By using our new data processing inequality, we 
developed a new single-letter upper bound for the sum rate of 
the multiple access channel with correlated sources. 

Appendix
Proof of Theorem 3

To find $\sup_{F(n, P_{X_1}),\ n = 1, 2, \dots} \lambda_2(\tilde{P}_{X_1U^n})$,
we need to exhaust the sets $F(n, P_{X_1})$ with $n \ge 1$. In the
following, we show that it suffices to check only the asymptotic case.

For any joint distribution $P_{X_1U^n} \in F(n, P_{X_1})$, we attach
an independent $U$, say $U_{n+1}$, to the existing $n$-sequence,
and get a new joint distribution
$P_{X_1U^{n+1}} = P_{X_1U^n} \otimes p_U$, where $p_U$ is the marginal
distribution of $U$ in the vector form. By arguments similar to those
in Section IV, we have that
$\lambda_i(\tilde{P}_{X_1U^{n+1}}) = \lambda_i(\tilde{P}_{X_1U^n})$.
Therefore, for every $P_{X_1U^n} \in F(n, P_{X_1})$, there exists
some $P_{X_1U^{n+1}} \in F(n+1, P_{X_1})$, such that
$\lambda_i(\tilde{P}_{X_1U^{n+1}}) = \lambda_i(\tilde{P}_{X_1U^n})$. Thus,

$$ \sup_{F(n, P_{X_1})} \lambda_2(\tilde{P}_{X_1U^n}) \le \sup_{F(n+1, P_{X_1})} \lambda_2(\tilde{P}_{X_1U^{n+1}}) \tag{29} $$

From (29), we see that
$\sup_{F(n, P_{X_1})} \lambda_2(\tilde{P}_{X_1U^n})$ is monotonically
non-decreasing in $n$. We also note that
$\lambda_2(\tilde{P}_{X_1U^n})$ is upper bounded by 1 for all $n$,
i.e., $\lambda_2(\tilde{P}_{X_1U^n}) \le 1$. Therefore,

$$ \sup_{F(n, P_{X_1}),\ n = 1, 2, \dots} \lambda_2(\tilde{P}_{X_1U^n}) = \lim_{n \to \infty} \sup_{F(n, P_{X_1})} \lambda_2(\tilde{P}_{X_1U^n}) \tag{30} $$

To complete the proof, we need the following lemma. 

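Lemma 3 [4] $\lambda_2(\tilde{P}_{XY}) = 1$ if and only if $P_{XY}$
decomposes. By $P_{XY}$ decomposes, we mean that there exist sets
$S_1 \subset \mathcal{X}$, $S_2 \subset \mathcal{Y}$, such that
$P(S_1)$, $P(\mathcal{X} - S_1)$, $P(S_2)$, $P(\mathcal{Y} - S_2)$ are
positive, while
$P((\mathcal{X} - S_1) \times S_2) = P(S_1 \times (\mathcal{Y} - S_2)) = 0$.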
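A small numerical illustration of Lemma 3, assuming NumPy: the first pmf below decomposes with $S_1 = \{x_1\}$ and $S_2 = \{y_1, y_2\}$, while the second moves a little mass across the two blocks. Both matrices and the helper name `lambda2` are illustrative assumptions.

```python
import numpy as np

def lambda2(p_joint):
    """Second singular value of P_X^{-1/2} P_XY P_Y^{-1/2}."""
    p_joint = np.asarray(p_joint, dtype=float)
    p_a, p_b = p_joint.sum(axis=1), p_joint.sum(axis=0)
    return np.linalg.svd(p_joint / np.sqrt(np.outer(p_a, p_b)), compute_uv=False)[1]

decomposing = [[0.2, 0.1, 0.0],
               [0.0, 0.0, 0.7]]
non_decomposing = [[0.15, 0.1, 0.05],
                   [0.05, 0.0, 0.65]]

print(lambda2(decomposing))       # equals 1 up to rounding
print(lambda2(non_decomposing))   # strictly below 1
```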



In the following, we will show by construction that there
exists a joint distribution that decomposes asymptotically.

For a given marginal distribution $P_{X_1}$, we arbitrarily choose
a subset $S_1$ from the alphabet of $X_1$. We find a set $S_2$
in the alphabet of $U^n$ such that $P(S_1) = P(S_2)$ if it is
possible. Otherwise, we pick $S_2$ such that $|P(S_1) - P(S_2)|$ is
minimized. We denote $S(n)$ to be the set of all subsets of the
alphabet of $U^n$ and we also define $P_{\max} = \max \Pr(s)$ for all
$s \in \mathcal{U}$. Then, we have

$$ \min_{S_2 \in S(n)} |P(S_2) - P(S_1)| \le P_{\max}^n \tag{31} $$



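We construct a joint distribution for $X_1$ and $U^n$ as follows.
First, we construct the joint distribution $P^I$ corresponding
to the case where $X_1$ and $U^n$ are independent. Second, we
rearrange the alphabets of $X_1$ and $U^n$ and group the sets $S_1$,
$\mathcal{X}_1 - S_1$, $S_2$ and $\mathcal{U}^n - S_2$ as follows

$$ P^I = \begin{bmatrix} P^I_{11} & P^I_{12} \\ P^I_{21} & P^I_{22} \end{bmatrix} \tag{32} $$

where $P^I_{11}$, $P^I_{12}$, $P^I_{21}$, $P^I_{22}$ correspond to the
sets $S_1 \times S_2$, $S_1 \times (\mathcal{U}^n - S_2)$,
$(\mathcal{X}_1 - S_1) \times S_2$,
$(\mathcal{X}_1 - S_1) \times (\mathcal{U}^n - S_2)$,
respectively. Here, we assume that $P(S_2) \ge P(S_1)$. Then, we
scale these four sub-matrices as
$P_{11} = \frac{P^I_{11} P(S_1)}{P(S_1) P(S_2)}$, $P_{12} = 0$,
$P_{21} = \frac{P^I_{21} (P(S_2) - P(S_1))}{(1 - P(S_1)) P(S_2)}$,
$P_{22} = \frac{P^I_{22} (1 - P(S_2))}{(1 - P(S_1))(1 - P(S_2))}$,
and let

$$ P = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} \tag{33} $$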

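We note that $P$ is a joint distribution for $X_1$ and $U^n$ with the
given marginal distributions. Next, we move the mass in the
sub-matrix $P_{21}$ to $P_{11}$, which yields

$$ P' \triangleq \begin{bmatrix} P'_{11} & 0 \\ 0 & P_{22} \end{bmatrix} = P + E = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} + \begin{bmatrix} E_{11} & 0 \\ -E_{21} & 0 \end{bmatrix} \tag{34} $$

where $E_{21} = P_{21}$,
$E_{11} = \frac{P^I_{11} (P(S_2) - P(S_1))}{P(S_1) P(S_2)}$, and
$P'_{11} = \frac{P_{11} P(S_2)}{P(S_1)}$. We denote $P'_{X_1}$ and
$P'_{U^n}$ as the marginal distributions of $P'$. We note that
$P'_{U^n} = P_{U^n}$ and $P'_{X_1} = P_{X_1} M$, where $M$ is a
scaling diagonal matrix. The elements in the set $S_1$ are scaled up
by a factor of $\frac{P(S_2)}{P(S_1)}$, and those in the set
$\mathcal{X}_1 - S_1$ are scaled down by a factor of
$\frac{1 - P(S_2)}{1 - P(S_1)}$. Then,

$$ \tilde{P}' = M^{-1/2} \tilde{P} + M^{-1/2} P_{X_1}^{-1/2} E P_{U^n}^{-1/2} \tag{35} $$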



We will need the following lemmas in the remainder of our
derivations. Lemma 5 can be proved using techniques similar
to those in the proof of Lemma 4 [9].

Lemma 4 [9] If $A' = A + E$, then
$|\lambda_i(A') - \lambda_i(A)| \le \|E\|_2$,
where $\|E\|_2$ is the spectral norm of $E$.

Lemma 5 If $A' = MA$, where $M$ is an invertible matrix,
then $\|M^{-1}\|_2^{-1} \le \lambda_i(A')/\lambda_i(A) \le \|M\|_2$.
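Both lemmas can be spot-checked on random matrices. The sketch below assumes NumPy; the matrix sizes, the seed, and the diagonal $M$ are arbitrary illustrative choices, and `sv` is a hypothetical helper returning singular values in decreasing order.

```python
import numpy as np

def sv(matrix):
    """Singular values, largest first."""
    return np.linalg.svd(matrix, compute_uv=False)

rng = np.random.default_rng(1)               # fixed seed, illustrative only
a = rng.random((3, 4))
e = 1e-2 * rng.standard_normal((3, 4))       # small additive perturbation
m = np.diag([1.2, 0.9, 1.1])                 # invertible scaling matrix

# Lemma 4: singular values move by at most the spectral norm of E.
gap = np.max(np.abs(sv(a + e) - sv(a)))
spectral_norm_e = np.linalg.norm(e, 2)
print(gap, "<=", spectral_norm_e)

# Lemma 5: ratios of singular values are sandwiched by norms of M.
ratios = sv(m @ a) / sv(a)
lower = 1.0 / np.linalg.norm(np.linalg.inv(m), 2)
upper = np.linalg.norm(m, 2)
print(lower, "<=", ratios, "<=", upper)
```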

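Since $P'$ decomposes, using Lemma 3, we conclude that
$\lambda_2(\tilde{P}') = 1$. We upper bound
$\|P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\|_2$ as follows,

$$ \|P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\|_2 \le \|P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\|_F \tag{36} $$

where $\|\cdot\|_F$ is the Frobenius norm. Combining (32) and (34),
we have

$$ \|P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\|_F \le \frac{P(S_2) - P(S_1)}{P_l\, P(S_2)}\, \|P_{X_1}^{-1/2} P^I P_{U^n}^{-1/2}\|_F \tag{37} $$

where $P_l = \min(P(S_1), 1 - P(S_1))$. Since $P^I$ corresponds
to the independent case, we have
$\|P_{X_1}^{-1/2} P^I P_{U^n}^{-1/2}\|_F = 1$ from Theorem 1.
Then, from (31), (36) and (37), we obtain

$$ \|P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\|_2 \le c_1 P_{\max}^n \tag{38} $$

where $c_1$ is a constant that does not depend on $n$.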
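From Lemma 5, we have

$$ \|M^{-1/2} P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\|_2 = \lambda_1\big(M^{-1/2} P_{X_1}^{-1/2} E P_{U^n}^{-1/2}\big) \le \Big( \frac{1 - P(S_1)}{1 - P(S_2)} \Big)^{1/2} c_1 P_{\max}^n \le c_2 P_{\max}^n \tag{39} $$

From Lemma 4, we have

$$ 1 - c_2 P_{\max}^n \le \lambda_2(M^{-1/2} \tilde{P}) \le 1 + c_2 P_{\max}^n \tag{40} $$

We upper bound $\|M^{1/2}\|_2$ as follows

$$ \|M^{1/2}\|_2 = \Big( \frac{P(S_2)}{P(S_1)} \Big)^{1/2} \le 1 + c_3 P_{\max}^n \tag{41} $$

Similarly, $\|M^{-1/2}\|_2^{-1} \ge 1 - c_4 P_{\max}^n$. From Lemma 5,
we have

$$ (1 - c_4 P_{\max}^n) \le \frac{\lambda_2(\tilde{P})}{\lambda_2(M^{-1/2} \tilde{P})} \le (1 + c_3 P_{\max}^n) \tag{42} $$

where $c_2$, $c_3$ and $c_4$ are constants that do not depend on $n$.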
Since $P$ is a joint distribution matrix, from Theorem 1, we
know that $\lambda_2(\tilde{P}) \le 1$. Therefore, we have

$$ (1 - c_4 P_{\max}^n)(1 - c_2 P_{\max}^n) \le \lambda_2(\tilde{P}) \le 1 \tag{43} $$

When $P_{\max} < 1$, corresponding to the non-trivial case,
$\lim_{n \to \infty} P_{\max}^n = 0$, and using (30), (24) follows.

The case $P(S_2) < P(S_1)$ can be proved similarly. ■
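The construction in the proof can be simulated for small $n$. The sketch below is one reading of the matrix $P$ in (33), under assumptions that are not in the paper: $X_1$ is binary with pmf (0.3, 0.7), $S_1$ is its first symbol, $U$ is binary with pmf (0.4, 0.6), and $S_2$ is grown greedily until its mass first reaches $P(S_1)$, so that the overshoot is at most $P_{\max}^n$. It assumes NumPy; `construct_lambda2` is a hypothetical name.

```python
import itertools

import numpy as np

def construct_lambda2(n, p_x1=(0.3, 0.7), p_u=(0.4, 0.6)):
    """lambda_2 of the normalized version of P in (33), with S1 = {first symbol of X1}."""
    # Probabilities of all length-n sequences of the i.i.d. source U.
    seq_probs = np.array([np.prod(s) for s in itertools.product(p_u, repeat=n)])
    a = p_x1[0]                                   # P(S1)
    # Grow S2 until its mass reaches P(S1); the overshoot is at most P_max^n.
    cum = np.cumsum(seq_probs)
    k = int(np.searchsorted(cum, a)) + 1          # number of sequences placed in S2
    b = cum[k - 1]                                # P(S2) >= P(S1)
    in_s2 = np.arange(seq_probs.size) < k
    # Row for S1: block P11 on S2, zero block P12 elsewhere.
    row_s1 = np.where(in_s2, seq_probs * a / b, 0.0)
    # Row for the complement of S1: block P21 on S2, block P22 elsewhere.
    row_rest = np.where(in_s2, seq_probs * (b - a) / b, seq_probs)
    p = np.vstack([row_s1, row_rest])
    p_tilde = p / np.sqrt(np.outer(p.sum(axis=1), p.sum(axis=0)))
    return np.linalg.svd(p_tilde, compute_uv=False)[1]

for n in (1, 4, 8, 12):
    print(n, construct_lambda2(n))
```

With these choices the second singular value is about 0.80 for $n = 1$ and exceeds 0.99 by $n = 12$, in line with the claim that the supremum in (24) is approached as $n$ grows. The sequence need not be monotone in $n$, because the greedy choice of $S_2$ is not optimal for each $n$.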